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Cellular neural networks for NP-hard optimization problems 



00 
O 

o 

(N 
X> 

PL, 
OO 



i 

C 

O 

o 



> 
o 
in 



o 

OO 
O 



X 



M. Ercsey-Ravasz 1 ' 2 , T. Roska 2 and Z. Neda 1 
1 Babes-Bolyai University, Faculty of Physics, RO-400084, Cluj, Romania 
Peter Pdzmdny Catholic University, Faculty of Information Technology, HU-1083, Budapest, Hungary 

(Dated: February 11, 2008) 

Nowadays, Cellular Neural Networks (CNN) are practically implemented in parallel, analog com- 
puters, showing a fast developing trend. Physicist must be aware that such computers are appropri- 
ate for solving in an elegant manner practically important problems, which are extremely slow on 
the classical digital architecture. Here, CNN is used for solving NP-hard optimization problems on 
lattices. It is proved, that a CNN in which the parameters of all cells can be separately controlled, 
is the analog correspondent of a two-dimensional Ising type (Edwards- Anderson) spin-glass system. 
Using the properties of CNN computers a fast optimization method can be built for such problems. 
Estimating the simulation time needed for solving such NP-hard optimization problems on CNN 
based computers, and comparing it with the time needed on normal digital computers using the 
simulated annealing algorithm, the results are astonishing: CNN computers would be faster than 
digital computers already at 10 x 10 lattice sizes. Hardwares realized nowadays are of 176 x 144 
size. Also, there seems to be no technical difficulties adapting CNN chips for such problems and the 
needed local control is expected to be fully developed in the near future. 

PACS numbers: 07.05.Mh, 89.20.Ff, 05.10.-a, 87.85.Lf 



I. INTRODUCTION 

Solving NP-hard problems is a key task when testing 
novel computing paradigms. These complex problems 
frequently appear in physics, life sciences, biometrics, lo- 
gistics, database search... etc., and in most cases they are 
associated with important practical applications as well 
The deficiency of solving these problems in a rea- 
sonable amount of time is one of the most important 
limitation of digital computers, thus all novel paradigms 
are tested in this sense. Quantum computing for example 
is theoretically well-suited for solving NP-hard problems, 
but the technical realization of quantum computers seems 
to be quite hard. Here we present a computing paradigm 
based on Cellular Neural Networks (CNN) with good per- 
spectives for fast NP-hard optimization. The advantage 
of this approach relative to quantum computing is that 
several practical realizations are already available. 

After the idea of CNN appeared in 1988 Q a detailed 
plan for a CNN computer was developed in 1993 
Since than the chip had a fast developing trend and the 
latest - already commercialized - version is the EYE-RIS 
chip with lattice size 176 x 144, mainly used as a vi- 
sual microprocessor Q . The physics community can also 
benefit from CNN based computers. It has been proven 
in several previous studies that this novel computational 
paradigm is usefull in solving partial differential equa- 
tions \h, 6], studying cellular automata models @, 0, do- 
ing image processing and making Monte Carlo simu- 
lations on lattice models [13, HH • Here it is shown that 
NP-hard optimization problems can be also effectively 
solved using such an approach. 

In the next section we will present briefly the struc- 
ture and dynamics of cellular neural networks together 
with the most developed applications of CNN computing. 
In the third section we prove that a CNN computer on 



which the parameters of each cell can be separately con- 
trolled, is the analog correspondent of a locally coupled 
two-dimensional spin-glass system. Using the properties 
of CNN computers a fast optimization algorithm can be 
thus developed. The local control of the parameters of 
each cell is already partially realized on some hardwares 
(for details see section 2.) and it is expected to be fully 
functional in the near future. Beside the fundamental 
interest for solving NP hard problems, the importance of 
this study consists also in motivating the development of 
hardwares in such direction. 



II. CELLULAR NEURAL NETWORKS AND 
CNN COMPUTERS 

The CNN Universal Machine (CNN-UM) Q is one spe- 
cial case of cellular wave computers [Hj] in which compu- 
tation is achieved using the spatial-temporal dynamics of 
a cellular neural network Q. The CNN is composed by 
L x L cells placed on a square lattice and interconnected 
through their neighbors [2j. Usually the 4 nearest and 
the 4 next-nearest neighbors (Moore neighborhood) are 
considered. Each cell is characterized by a state value: 
Xij(t) representing a voltage in the circuit of the cell. 
The cell has also an input value (voltage) mj, which is 
constant in time and can be defined at the beginning 
of an operation. The third characteristic quantity of the 
cell is the output value j (t) . This is equivalent with the 
Xi_j state value in a given range. More specifically it is 
a piece- wise linear function, bounded between —1 (called 
as white) and 1 (black): y = f(x) = \{\ x+1 | — | x—1 ). 

The wiring between neighboring cells assures that the 
state value of each cell can be influenced by the input 
and output values of its neighbors. The equation gov- 
erning the dynamics of the CNN cells results from the 
time-evolution of the equivalent circuits. Supposing the 
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8 Moore neighbor interactions it has the following form 
ft 

das- (t) 1+1 3+1 

k=i-l l=j-l 

4+1 j+1 

+ X] B i<r,k,l u k,l + Zi,j (1) 

fe=i-i ;=j-i 

where i, j denotes the coordinates of the cell and the sum- 
mation indeces k, I are for its neighbors. Self-interaction 
(k = i,l = j) is also possible. The set of parameters 
{A, B, z} is called a template and controls the whole sys- 
tem. An operation is performed by giving the initial 
states of the cells, the input image (the input values of 
all cells) and by defining a template. The states of all 
cells will vary in parallel and the result of the operation 
will be the final steady state of the CNN. If the state 
values (xij) of all cells remain bounded in the [—1,1] re- 
gion (i.e. yij = Xij holds for each cell at any time t), 
than each operation is equivalent with solvinga differ- 
ential equation defined by the template itself [!, [f| [l3| . 
When Xij does not remain bounded, than the piece-wise 
linear function described at the definition of the output 
value Ui,j, takes an important role. The final steady state 
will not be simply the solution of the differential equa- 
tion, and this case can be used for defining other useful 
operations as wel l [l3j . 

The CNN-UM Q is a programmable analogic (analog 
& logic) cellular wave computer. Beside the analog cir- 
cuits described by Eq[TJ each cell contains also a logic 
unit, local analog and logic memories and a local com- 
munication and control unit. The logic unit and logic 
memories are included to complement the analog com- 
putation. In this manner basic logic operations can be 
performed without defining complicated templates for it. 
In the local logic memories one can save a binary value 
(1 and respectively), and in the local analog memories 
it is possible to save real values between —1 and 1. Since 
the CNN array is mainly used for image processing and 
acquisition, the binary values are often referred as black 
and white, and the real values bounded between —1 and 
1 are mapped in a gray-scale scheme. Beside these local 
units, the CNN-UM has also a global analog program- 
ming unit which controls the whole system, making it a 
programmable computer. It can be easily connected to 
PC type computers and programmed with special lan- 
guages, for example the Analogic Macro Code (AMC). 

The physical implementations of these computers are 
numerous and widely different: mixed-mode CMOS, em- 
ulated digital CMOS, FPGA, and also optical. For prac- 
tical purposes the most promising applications are for im- 
age processing, robotics or sensory computing purposes 
Q , so the main practical drive in the mixed- mode imple- 
mentations was to build a visual microprocessor [l3T ] . In 
the last decades the size of the engineered chips was con- 
stantly growing, the new cellular visual microprocessor 
EYE-RIS [i[ for example has 176 x 144 processors, each 



cell hosting also 4 optical sensors. Parallel with increas- 
ing the lattice size of the chips, engineers are focusing on 
developing multi-layered, 3 dimensional chips as well. 

For the first experimental versions the templates (cou- 
pling parameters) arc defined identical for all cells. This 
means that for example A(i, j; i + l, j) is the same for all 
coordinates. In such way, on the two-dimensional 
CNN chip, all the A couplings are defined by a single 3x3 
matrix. Totally, 9 + 9 + 1 = 19 parameters are needed to 
define the whole, globally valid, template ({A,B,z}). On 
the latest version of the CNN chips (ACE16K, EYE-RIS) 
the z(i,j) parameter can already be locally varied. It is 
expected that on newer chips one will be able to sepa- 
rately control also the A(i, j; k, I) and B(i, j; k, I) connec- 
tions as well. 

Many applications ideal for the analogic and parallel 
architecture of the CNN-UM were already developed and 
tested. Studies dealing with partial differential equations 
[H, [f| or cellular automata models @, Q prove this. In 
some of our latest publications we have shown that CNN 
computers are suitable also for stochastic simulations. 
The natural noise of the analog chip can be effectively 
used to generate random numbers approximately 4 times 
faster than on digital computers [l(|. We also presented 
experiments in which the site-percolation problem and 
the two-dimensional Ising model was properly solved on 
the ACE16K chip (with 128 x 128 cells) pjlGjl. 



III. NP-HARD OPTIMIZATION ON CNN 
ARCHITECTURES 

The aim of the present study is to prove that CNN 
computing is also suitable for solving effectively complex 
optimization problems on spin-glass type lattice models. 
We consider a two-dimensional CNN where the templates 
(parameters of Eq. [T]) can be locally controlled. Ma- 
trix A is considered symmetric A(i,j;k, I) = A(k,l;i,j), 
— 1 for all and the elements are bounded 
A{i,j;k,l) E [—1,1] and (k, I) denote two neigh- 

boring cells). Matrix B, which controls the effect of the 
input image, will be taken simply as: B(i,j; = b and 
B(i,j;k,l) = 0, ^ {k,l}. The parameter z is cho- 

sen as z = 0, so finally our template is defined by {A, b} 
alone. 

In an earlier work Chua et al. (_2| defined a Lyapunov 
function for the CNN, which behaves like the "energy" 
(Hamiltonian) of the system. For the CNN defined above 
it can be written simply as 

E {t) = - zl A il j.k,iVi,jyk,i - byluijujj, (2) 

<i,j;k,l> i,j 

where < i,j;k,l > denotes pairs of Moore neighbors, 
each pair taken only once in the sum. j/jj denotes the 
output value of each cell and Uij stands for an arbitrary 
input image. By choosing the parameter 6 = 0, the en- 
ergy of this special CNN is similar with the energy of 
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an Ising type system on square lattice with locally vary- 
ing coupling constants. The difference is that Ising spins 
are defined as ±1, while here we have continuous val- 
ues between [—1,1]. Since the A(i,j;k,l) coupling con- 
stants can be positive and negative as well, locally cou- 
pled spin-glasses can be mapped in such systems. In 
the following we will be especially interested in the case 
when the A(i,j;k, I) couplings lead to a frustration and 
the quenched disorder in the system is similar with that 
of spin-glass systems fflil ITU). 

The Lyapunov function defined by Chua et al. has 
two important properties (2j: 1.) it is always a mono- 
tone decreasing function in time, dE/dt < 0, so starting 
from an initial condition E can only decrease during the 
dynamics of the CNN. 2.) the final steady state is a 
local minimum of the energy: dE/dt = 0. In addition 
to these, our CNN has also another important property: 
due to the fact that all self-interaction parameters are 
= 1, it can be shown that the output values 
of the cells in a final steady state will be always either 1 
or —1. The local minima achieved by the CNN is thus 
an Ising-like configuration. We can conclude thus that 
starting from any initial condition the final steady state 
of the template - meaning the result of an operation - will 
be always a local minimum of the spin-glass type Ising 
spin system with local connections defined by matrix A. 
The fact that one single operation is needed for finding a 
local minimum of the energy, gives us hope to develope 
fast optimization algorithms. 

As already emphasized, the complex frustrated case 
(locally coupled spin-glass type system), where the A 
coupling parameters generates a non-trivial quenched 
disorder, will be considered here. The minimum energy 
configuration of such systems is searched by an algorithm 
which is similar with the well-known simulated annealing 
method (l6j . The noise is included with random input 
images (itjj values in eq. (2) acting as an external locally 
variable magnetic field. The strength of this field is gov- 
erned through parameter b. Whenever b is different from 
zero, our CNN template minimizes the energy with form 
[SJ the first part of it being the energy of the considered 
spin-glass type model and the second part an additional 
term, which gets minimal when the state of the system is 
equal to the input image (the external magnetic field). If 
b is large, the result will be the input image itself, if b = 
the result is a local minimum of the pure Ising-type sys- 
tem. For values in between, our result is a " compromise" 
between the two cases. Slowly decreasing the value of 
6 will result in a process similar with simulated anneal- 
ing, where the temperature of the system is consecutively 
lowered. First big fluctuations of the energy are allowed, 
and by decreasing this we slowly drive the system to a 
low energy state. Since the method is a stochastic one, 
we can of course never be totally sure that the global 
minimum will be achieved. 

The steps of the algorithm are the following: 
1. One starts from a random initial condition x, and 
b = 5 (with this value the result of the template is al- 



most exactly the same as the input image). 

2. A binary random input image u is generated with 1/2 
probability of black (1) pixels, 

3. Using the x initial state and the u input image the 
CNN template is applied, 

4. The value of b is decreased with steps Ab, 

5. Steps 2-4 are repeated until b — is reached. The 
results of the previous step (minimization) is considered 
always as the initial state for the next step. 

6. When reaching b = the image (Ising spin configura- 
tion) is saved and the energy is calculated. 

In the classical simulated annealing algorithm several 
thousands of steps for a single temperature are needed. 
Here the CNN template working totally in parallel re- 
places all these steps. Similarly with choosing the cool- 
ing rate in simulated annealing, choosing the value of Ab 
is also a delicate problem. A proper value providing an 
acceptable compromise between the quality of the results 
and speed of the algorithm has to be found. For each sys- 
tem size one can find an optimal value of A&, but as one 
would expect this is rapidly decreasing by increasing the 
system size. It is much more effective, both for perfor- 
mance (meaning the probability of finding the real global 
optimum) and speed, to choose a constant Ab — 0.05 step 
and repeat the whole cooling process several times. As 
a result, several different final states will be obtained, 
and we have a higher probability to get the right global 
minima between these. 

For testing the efficiency of the algorithm one needs 
to measure the number of steps necessary for finding the 
right global minima. To do this, one has to previously 
know the global minima. In case of small systems with 
L = 5, 6 this can be obtained by a quick exhaustive search 
in the phase-space. For bigger systems the classical sim- 
ulated annealing algorithm was used. The temperature 
was decreased with a rate of 0.99 (Tfi na i/T ini ) and 1000 
Monte Carlo steps were performed for each temperature. 

In the present work spin-glass systems with 
A(i,j;k,l) — ±1 local connections were studied. 
The p probability of the positive bonds was varied 
(influencing the amount of frustration in the system), 
and local interactions with the 8 Moore neighbors were 
considered. For several p densities of the positive links 
and various system sizes, we calculated the average 
number of steps needed for finding the energy minimum. 
As naturally is expected for the non-trivial frustrated 
cases, the needed simulation time exponentially increases 
with the system size. As an example, the p = 0.4 case 
is shown on Fig. [TJi. Circles represent the estimated 
time on CNN computers and stars illustrate the time 
measured on a Pentium 4, 3.4 GHz computer using 
the classical simulated annealing method. For the 
calculation time on CNN computers we have used a 
convention (arguments given in the next paragraph) 
that roughly 1000 simulation steps are made in 1 second. 
As observable in the figure, we could made estimates for 
relatively small system sizes only (L > 12). The reason 
for this is that we had to simulate also the operations 
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FIG. 1: a) Time needed to reach the minimum energy as a 
function of the lattice size L. Circles are for estimates on 
CNN computers and stars are simulated annealing results on 
3.4 GHz Pentium 4. Results are averaged on 10000 different 
configurations with p = 0.4 probability of positive bonds. For 
the CNN algorithm Afe = 0.05 was chosen. For simulated 
annealing the initial temperature was To = 0.9 , final tem- 
perature Tf = 0.2 and the decreasing rate of the temperature 
was fixed as 0.99. b) Number of steps needed for reaching the 
presumed global minima as a function of the probability p, 
of positive connections. Simulation results with the proposed 
algorithm on a CNN lattice with size L = 7. 



on the CNN chip, and for large lattices a huge system 
of partial differential equations had to be solved. This 
process gets quite slow for bigger lattices. 

The needed average number of steps to reach the esti- 
mated energy minima depends also on the p probability 
of the positive connections in the system. On FigQJ) 
we illustrate this for a system with size L — 7. To ob- 
tain this data for each p value 5000 different systems 
were analyzed. As observable on FigJTJ? the system is 
almost equally hard to solve for all p values in the rage 
of p e (0,0.6). 

Finally, let us have some thoughts about the estimated 
speed of such an optimization algorithm. As mentioned 
earlier, on the nowadays available CNN chips, only pa- 
rameter z can be locally varied, parameters A and B are 
3x3 matrices, uniformly applied for all cells. The rea- 
son for no local control of A and B seems to be simply 
the lack of motivations. In image processing applications 
no really useful algorithms were developed, which would 
require these locally variable connections. Realizing the 
local control of A and B is technically possible and is ex- 
pected to be included in the newer versions of the CNN 
chips. This modification would not change the properties 
and the speed of the chip, only the control unit and tem- 
plate memories would become more complicated. Also, 
introducing the connection parameters in the local mem- 
ories of the chip would take a slightly longer time. In 
the specific problem considered here the connection pa- 
rameters have to be introduced only once for each prob- 
lem, so this would not effect in a detectable manner the 
speed of calculations. Based on our previous experience 
with the ACE16K chip (with sizes 128 x 128) [la [ID] we 
can make an estimation of the speed for the presented 
optimization algorithm. This chip with its parallel archi- 
tecture solves one template in a very short time - of the 



order of microseconds. For each step in the algorithm one 
also needs to generate a random binary image. This pro- 
cess is already 4 times faster on the ACE16K chip than 
on a 3.4 GHz Pentium 4 computer and needs around 
100/js (see [10(). It is also helpful for the speed, that in 
the present algorithm it is not needed to save informa- 
tion at each step, only once at the end of each cooling 
process. Saving an image takes roughly 10 milliseconds 
on the ACE16K, but this is done only once after sev- 
eral thousand of simulation steps. Making thus a rough 
estimate for our algorithm, a chip with similar proper- 
ties like the ACE16K should be able to compute between 
1000 — 5000 steps in one second, independently of the lat- 
tice size. Using the lower estimation value (1000 steps 
/second) and following up the number of steps needed in 
case of p — 0.4, the estimated average time for solving 
one problem is plotted as a function of the lattice size in 
Fig. [TJi (empty circles). Comparing this with the speed of 
simulated annealing (SA) performed on a 3.4 GHz Pen- 
tium 4 (stars on Fig. [1^), the results for larger lattice 
sizes are clearly in favour of the CNN chips. For testing 
the speed of simulated annealing we used the following 
parameters: initial temperature To = 0.9 , final temper- 
ature Tf — 0.2, decreasing rate of the temperature 0.99. 
Results were avereged for 10000 different bond distribu- 
tions. From Fig. QJi it results that the estimated time 
needed for the presented algorthim on a CNN chip would 
be smaller than simulated annealing already at 10 x 10 
lattice sizes. 

Spin-glass like systems have many applications in 
which global minimum is not crucial to be exactly found, 
the minimization is needed only with a margin of error. 
In such cases the number of requested steps will decrease 
drastically. As an example in such sense, it has been 
shown that using spin-glass models as error-correcting 
codes, their cost-performance is excellent [l7[, and the 
systems are usually not even in the spin-glass phase. In 
this manner by using the CNN chip, finding acceptable 
results could be very fast, even on big lattices. 



IV. CONCLUSION 

A cellular neural network with locally variable param- 
eters was used for finding the optimal state of locally 
coupled, two-dimensional, Ising type spin-glass systems. 
By simulating the proposed optimization algorithm on 
a CNN chip, where all connections can be locally con- 
trolled, very good perspectives for solving such NP hard 
problems were predicted: CNN computers could be faster 
than digital computers already at a 10 x 10 lattice size. 
Chips with 2 and 3 layers of cells were also produced 
(CACElk, XENON) and increasing the number of lay- 
ers is expected in the near future. This further extends 
the number of possible applications. On two layers is 
possible to map already a spin system with any connec- 
tion matrix (even globally coupled spins) and also other 
important NP-hard problems (e.g. K-SAT) may become 
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